ABSTRACT


The possibility of using image coding techniques to
facilitate stereoscopic depth determination for scene analysis
has been investigated over the past year. A promising
technique, run length coding, was applied to scene analysis.
A method of encoding which helps ensure regional coherency
is described in this paper. In addition, work to enable
operations such as feature extraction and feature correlation
in a run length coded data structure is described. The
use of these operations in determining depth information
from two views of a scene is outlined. Finally, the design
of an image acquisition system to enable implementation of
the depth determination method is briefly discussed.
I.  Introduction

The purpose of the research effort reported here has been
to develop methods for three-dimensional scene analysis.
The particular approach used involves the encoding of two
images of the scene, a derivation of depth information about
the scene, and finally an analysis of this three-dimensional
data in terms of some primitive descriptions of scenes.

II.  Encoding  of  Images 

There has been a great deal of research into efficient
coding of images for transmission purposes [1,2,3]. Since the
input of images is a key component of the scene analysis
process, an effort was made to examine these transmission
coding techniques to determine their usefulness. Of the methods
investigated, the technique of run length coding was found to
be the most promising. A method of coding was developed, as
outlined in [4], which is quite amenable to hardware
implementation and which reduces the data storage area by a
ratio which, at worst, is 1:1 but may reach 4:1 to 8:1
with little degradation in the image quality necessary for
analysis. It should be noted here that this reduction is in
storage area words, not in bits as commonly used for
transmission purposes, and that full spatial and intensity
resolution is maintained.

Basically the encoding scheme proceeds pixel by pixel on a
top to bottom, left to right basis (the usual order for raster
scan systems). If the horizontal gradient measure does not
exceed some threshold, the pixel is added to the current run
and assumes the intensity of the previously encoded pixel. If
the horizontal gradient measure exceeds the threshold, then the
previous run is terminated and a new one begun. The new run is
assigned the intensity of the encoded pixel above it in the
previous line if the vertical gradient measure does not exceed
a given threshold; otherwise the run is assigned the intensity
of the current pixel itself. The latter case is, in effect, the
start of a new region of the image. This vertical check
assures a two-dimensional cohesiveness to what is basically
a one-dimensional encoding process.
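
The encoding rule just described can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not
the hardware-oriented method of [4]: the gradient measures are taken
here to be absolute differences between the current pixel and the
already encoded neighbor (to the left, or above), the first pixel of
every line is assumed to start a new run, and the names encode_image,
h_thresh and v_thresh are hypothetical.

```python
from typing import List, Sequence, Tuple

Run = Tuple[int, int]  # (last column of the run, encoded intensity)


def encode_image(image: Sequence[Sequence[int]],
                 h_thresh: int, v_thresh: int) -> List[List[Run]]:
    """Run-length encode a grey-level image, one list of runs per line."""
    encoded_rows: List[List[Run]] = []
    prev_line = None  # encoded intensity of every pixel in the line above
    for row in image:
        runs: List[Run] = []
        line: List[int] = []  # encoded intensity of every pixel in this line
        for x, pixel in enumerate(row):
            # Horizontal check (assumed measure: absolute difference
            # against the previously encoded pixel).
            if x > 0 and abs(pixel - line[x - 1]) <= h_thresh:
                runs[-1] = (x, runs[-1][1])   # extend the current run
                line.append(line[x - 1])
                continue
            # A new run begins. Vertical check against the encoded pixel
            # above decides which intensity the run is assigned.
            if prev_line is not None and abs(pixel - prev_line[x]) <= v_thresh:
                intensity = prev_line[x]      # continue the region above
            else:
                intensity = pixel             # start of a new region
            runs.append((x, intensity))
            line.append(intensity)
        encoded_rows.append(runs)
        prev_line = line
    return encoded_rows


image = [[10, 11, 12, 50, 51],
         [11, 12, 13, 52, 90]]
rows = encode_image(image, h_thresh=3, v_thresh=3)
# rows[1] reuses the intensities 10 and 50 inherited from the line above
# and opens a new region of intensity 90 at the last pixel.
```

Note that in this sketch a run in the second line inherits exactly the
intensity of the run above it, which is what gives vertically adjacent
runs of one region a common value.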

III.  Operations  in  Encoded  Images 

A data structure for images encoded as above is reported
in [4]. The structure consists of a list of run length
endpoints, a list of run length intensities, and a small pointer
table to ease access. A method of access to run
lengths located at and near a particular Cartesian coordinate
is also shown. The use of this method enables arithmetic or
logical operations to be performed within a region of the image.
It should be pointed out, however, that while this scheme can
be used for operations on regularly shaped regions, its most
natural manipulation leads to irregularly shaped areas. This
can lead to some conceptual difficulty in the use of such a
data structure.
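
A minimal sketch of such a structure follows. The exact layout used
in [4] is not reproduced here; it is assumed that the pointer table
holds, for each line, the index of its first run (plus one closing
entry), and that endpoints are stored as the last column of each run.
The names RunLengthImage, pack, run_index, intensity_at and runs_near
are hypothetical.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class RunLengthImage:
    endpoints: List[int]    # last column of each run, all lines concatenated
    intensities: List[int]  # intensity of each run, parallel to endpoints
    row_start: List[int]    # pointer table: first run of each line, plus end

    def run_index(self, x: int, y: int) -> int:
        """Index of the run covering column x of line y (x inside the line)."""
        lo, hi = self.row_start[y], self.row_start[y + 1]
        # Endpoints increase within a line, so a binary search finds the
        # first run whose endpoint is at or beyond x.
        return bisect_left(self.endpoints, x, lo, hi)

    def intensity_at(self, x: int, y: int) -> int:
        return self.intensities[self.run_index(x, y)]

    def runs_near(self, x: int, y: int, halfwidth: int) -> List[int]:
        """Indices of runs in line y overlapping columns x +/- halfwidth."""
        lo, hi = self.row_start[y], self.row_start[y + 1]
        first = bisect_left(self.endpoints, max(x - halfwidth, 0), lo, hi)
        last = min(bisect_left(self.endpoints, x + halfwidth, lo, hi), hi - 1)
        return list(range(first, last + 1))


def pack(rows: Sequence[Sequence[Tuple[int, int]]]) -> RunLengthImage:
    """Build the structure from per-line lists of (endpoint, intensity)."""
    endpoints, intensities, row_start = [], [], [0]
    for runs in rows:
        for end, value in runs:
            endpoints.append(end)
            intensities.append(value)
        row_start.append(len(endpoints))
    return RunLengthImage(endpoints, intensities, row_start)


img = pack([[(2, 10), (4, 50)],
            [(2, 10), (3, 50), (4, 90)]])
value = img.intensity_at(3, 1)       # run covering column 3 of line 1
neighbors = img.runs_near(3, 1, 1)   # runs touching columns 2 through 4
```

The neighborhood returned by runs_near is a set of whole runs, so its
outline follows run boundaries rather than a fixed rectangle; this is
the irregular shape referred to above.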


Region growing and feature extraction algorithms have
been developed to aid in using this data structure. These
include a region growing technique to identify evenly shaded
regions of the image, a feature extraction algorithm which
indicates locations within the image where local structure
suggests a corner-like arrangement, a measure of similarity
between two such extracted features for use in matching and
comparison operations, and finally, a procedure to check for
and trace an edge extending from one of these features to
another.
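
As an illustration of how region growing might operate directly on
runs, the sketch below merges runs on adjacent lines that overlap in
column range and have similar intensity, using a union-find table.
This is a generic technique substituted for illustration and is not
claimed to be the algorithm developed in this work; the name
grow_regions and the tolerance parameter tol are hypothetical.

```python
from typing import List, Sequence, Tuple


def grow_regions(rows: Sequence[Sequence[Tuple[int, int]]],
                 tol: int) -> List[List[int]]:
    """Label each run with a region number.

    rows holds, per line, (last column, intensity) pairs. Runs on
    adjacent lines join one region when their column ranges overlap
    and their intensities differ by at most tol.
    """
    runs = []      # (first column, last column, intensity) of every run
    row_ids = []   # per line, the indices of its runs in `runs`
    for row in rows:
        start, ids = 0, []
        for end, value in row:
            ids.append(len(runs))
            runs.append((start, end, value))
            start = end + 1
        row_ids.append(ids)

    parent = list(range(len(runs)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)  # smallest index is the label

    for y in range(1, len(rows)):
        for i in row_ids[y]:
            s, e, v = runs[i]
            for j in row_ids[y - 1]:
                s2, e2, v2 = runs[j]
                if s <= e2 and s2 <= e and abs(v - v2) <= tol:
                    union(i, j)

    return [[find(i) for i in ids] for ids in row_ids]


labels = grow_regions([[(2, 10), (4, 50)],
                       [(2, 10), (3, 50), (4, 90)]], tol=0)
# The two runs of intensity 10 share a label, as do the two runs of
# intensity 50; the run of intensity 90 forms its own region.
```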

IV.  Derivation  of  Depth  Information 

In [5] a review of previously tried methods of depth
determination using both single and multiple views is
presented. A relatively straightforward approach to depth
determination for the case of planar polygonal objects is
outlined. This approach uses the encoded data structure
described above along with the operations described in [4].

A target feature is extracted from one image and then
compared to several candidate features extracted from the
second. Next, a feature located along an edge from the
previous feature is extracted and compared to candidates.
The existence of an edge between the new candidate and the
previously selected match is used to determine the confidence
of match and helps prevent false matching. This process is
extended to define polygons which are then merged through the
use of the derived depth. Finally, the polygon description
in three space is used to separate objects from each other
and from the background.
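
Once a feature has been matched in both views, its depth follows from
the horizontal displacement between the two image positions. The
camera geometry is not specified in this summary, so the sketch below
assumes the simplest case: two identical cameras with parallel optical
axes, separated by a known baseline. Under that assumption the
standard triangulation relation is depth = focal length x baseline /
disparity. The function and parameter names are hypothetical.

```python
def depth_from_disparity(x_left: float, x_right: float,
                         focal_length: float, baseline: float) -> float:
    """Depth of a matched feature under a parallel-axis camera assumption.

    x_left and x_right are the horizontal image coordinates of the same
    feature in the left and right views, in the same units as
    focal_length. The result is in the units of baseline.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        # A feature in front of both cameras always shifts toward the
        # left in the right-hand view, so this indicates a false match.
        raise ValueError("non-positive disparity: not a valid match")
    return focal_length * baseline / disparity


# A corner seen at column 120 in the left view and column 100 in the
# right view, with a focal length of 500 pixels and a 0.5 m baseline.
z = depth_from_disparity(120.0, 100.0, focal_length=500.0, baseline=0.5)
```

Larger disparities correspond to nearer features, which is what allows
the matched polygons to be ordered in depth and separated from the
background.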

V.  Implementation 

A flexible image acquisition and display system [6] was
designed and has been partially constructed. It allows the
input of a section of an image from either of two television
cameras and a simultaneous output to a display. High speed
block transfers to and from host computer memory are used.
The host computer has control over the acquired section size
(currently limited to 4096 pixels), resolution, format and
position within the television image frame. All transfers
to and from the buffer memory system can take place at
a rate of 100 nsec/byte. In addition, a run-length
encoder-controller has been designed so that the buffer memory
may be used for run-length encoding.

Since the image input facility has not yet been completed,
software was written to evaluate the performance of
different run length coding techniques on images stored on
magnetic tape. The color display capabilities of the Signal
Processing Laboratory were quite useful in the interactive
design and evaluation process.


VI.  Publications 

To date, the description of work performed and ideas
and methods developed appears in several Signal Processing
Laboratory Reports [4,5,6,7]. An expanded version of the
depth determination discussion in [4] is under preparation
for submission to a journal. The
results of actual implementation of these ideas will be
included in that publication. A discussion of these results
will also be submitted for presentation at appropriate
conferences in the future.
VII.  Best  Laid  Plans...

It  is  not  possible  in  advance  to  be  certain  of  the 
results  or  even  of  the  difficulties  to  be  encountered  in 
investigation  of  approaches  to  research  problems.  Several 
problem  areas  which  were  tackled  during  the  course  of  this 
research  failed  to  yield  encouraging  results. 

It was initially hoped that some method of using the
Walsh-Hadamard transform as an aid to cross-correlation of
image subareas could be found, so that the computational
advantages of this transform over the demonstrably
useful Fourier transform could be put to use. The work of
Parkyn [8,9] has shown that it is not possible to correlate
unequal sized areas using the Walsh transform and that the
software overhead required discourages its use otherwise.
This result put an end to hopes for speeding computation and
halted efforts to develop a method for transforming run length
coded data.
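
For reference, the computational appeal of the Walsh-Hadamard
transform is that its fast form needs only additions and
subtractions, with no complex multiplications. The sketch below is
the textbook fast transform in natural (Hadamard) order, without
normalization; it is not taken from [8,9] and does not perform
correlation. The name fwht is hypothetical.

```python
from typing import List, Sequence


def fwht(values: Sequence[int]) -> List[int]:
    """Unnormalized fast Walsh-Hadamard transform, Hadamard ordering.

    The input length must be a power of two. Only additions and
    subtractions are used, n * log2(n) of them in total.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine elements h apart within blocks of 2h.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a


signal = [1, 0, 1, 0, 0, 1, 1, 0]
spectrum = fwht(signal)
# Applying the transform twice returns the input scaled by its length.
restored = [v // len(signal) for v in fwht(spectrum)]
```

The power-of-two length restriction in this sketch is a property of
the fast algorithm itself and is separate from the limitation on
unequal sized areas reported by Parkyn.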

An  effort  to  develop  a technique  similar  to  the  Fast 
Fourier  Transform  but  specifically  applicable  to  run  length 
coded  data  has  been  put  off  to  the  future.  Perhaps  time  will 
allow  a re-opening  of  this  investigation  later. 

It was somewhat disappointing that progress could not
proceed at a greater rate on implementation of a working
system for stereoscopic depth determination. Hardware
construction and software development have been slowed by
the increasing usage and decreasing reliability of the computer
facilities within the Signal Processing Laboratory. The
facilities allow highly interactive development work and great
freedom for modifications, but the major CPU, an Adage AGT-30,
is primarily a discrete device system of the mid-sixties and,
unfortunately, is requiring more and more maintenance.

Finally, it was somewhat frustrating to confront the
reality of serial computation in a three-dimensional domain.
Future advances in associative, array-oriented computer
architectures will certainly be welcomed by those involved in
scene and image analysis. Perhaps then the problem of performing
a multi-dimensional operation will not seem so time-consuming
and so awkward.